Building an English-iraqi Arabic machine translation system for spoken utterances with limited resources

نویسندگان

  • Jason Riesa
  • Behrang Mohit
  • Kevin Knight
  • Daniel Marcu
چکیده

This paper presents an English-Iraqi Arabic speech-to-speech statistical machine translation system using limited resources. In it, we explore the constraints involved, how we endeavored to mitigate such problems as a non-standard orthography and a highly inflected grammar, and discuss leveraging existing plentiful resources for Modern Standard Arabic to assist in this task. These combined techniques yield a reduction in unknown words at translation time by over 40% and a +3.65 increase in BLEU score over a previous state-of-the-art system using the same parallel training corpus of spoken utterances.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Morphological Modeling for Machine Translation of English-Iraqi Arabic Spoken Dialogs

This paper addresses the problem of morphological modeling in statistical speech-tospeech translation for English to Iraqi Arabic. An analysis of user data from a real-time MT-based dialog system showed that generating correct verbal inflections is a key problem for this language pair. We approach this problem by enriching the training data with morphological information derived from sourceside...

متن کامل

Colloquial Iraqi ASR for speech translation

In this paper we describe a real-time speech recognition system developed for colloquial Iraqi Arabic. This system is currently used in our speech-to-speech translation system configured for bi-directional communication in English and Iraqi on a laptop. We present experimental results on Iraqi utterances from different speech-to-speech translation domains, and analyze the usefulness of acoustic...

متن کامل

The BBN 2007 displayless English/iraqi speech-to-speech translation system

Spoken communication across a language barrier is of increasing importance in both civilian and military applications. In this paper, we present an English/Iraqi Arabic speech-to-speech translation system for the military force protection domain (checkpoints, municipal services surveys, basic descriptions of people, houses, vehicles, etc). The system combines statistical N-gram speech recogniti...

متن کامل

Improvements in machine translation for English/iraqi speech translation

In this paper, we describe techniques for improving machine translation quality in the context of speech-to-speech translation for significantly different language pairs. Specifically, we explore three broad approaches for improving translation from English to Iraqi and vice versa. First, we investigate normalization techniques which address the differences in spoken and written forms of both l...

متن کامل

A Hybrid Phrase-based/Statistical

Spoken communication across a language barrier is of increasing importance in both civilian and military applications. In this paper, we present a system for taskdirected 2-way communication between speakers of English and Iraqi colloquial Arabic. The application domain of the system is force protection. The system supports translingual dialogue in areas that include municipal services surveys,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006